7 research outputs found

    The Green500 List: Escapades to Exascale

    Energy efficiency is now a top priority. The first four years of the Green500 have seen the importance of energy efficiency in supercomputing grow from an afterthought to the forefront of innovation as we near a point where systems will be forced to stop drawing more power. Even so, the landscape of efficiency in supercomputing continues to shift, with new trends emerging and unexpected shifts in previous predictions. This paper offers an in-depth analysis of the new and shifting trends in the Green500. In addition, the analysis offers early indications of the track we are taking toward exascale, and what an exascale machine in 2018 is likely to look like. Lastly, we discuss the new efforts and collaborations toward designing and establishing better metrics, methodologies and workloads for the measurement and analysis of energy-efficient supercomputing.

    Taming Multi-core Parallelism with Concurrent Mixin Layers

    The recent shift in computer system design to multi-core technology requires that the developer leverage explicit parallel programming techniques in order to utilize available performance. Nevertheless, developing the requisite parallel applications remains a prohibitively difficult undertaking, particularly for the general programmer. To mitigate many of the challenges in creating concurrent software, this paper introduces a new parallel programming methodology that leverages feature-oriented programming (FOP) to logically decompose a product line architecture (PLA) into concurrent execution units. In addition, our efficient implementation of this methodology, which we call concurrent mixin layers, uses a layered architecture to facilitate the development of parallel applications. To validate our methodology and accompanying implementation, we present a case study of a product line of multimedia applications deployed within a typical multi-core environment. Our performance results demonstrate that a product line can be effectively transformed into parallel applications capable of utilizing multiple cores, thus improving performance. Furthermore, concurrent mixin layers significantly reduces the complexity of parallel programming by eliminating the need for the programmer to introduce explicit low-level concurrency control. Our initial experience gives us reason to believe that concurrent mixin layers is a promising technique for taming parallelism in multi-core environments.

    Accelerating Electrostatic Surface Potential Calculation with Multiscale Approximation on Graphics Processing Units

    Tools that compute and visualize biomolecular electrostatic surface potential have been used extensively for studying biomolecular function. However, determining the surface potential for large biomolecules on a typical desktop computer can take days or longer using currently available tools and methods. This paper demonstrates how one can take advantage of graphics processing units (GPUs) available in today’s typical desktop computer, together with a multiscale approximation method, to significantly speed up such computations. Specifically, the electrostatic potential computation, using an analytical linearized Poisson-Boltzmann (ALPB) method, is implemented on an ATI Radeon 4870 GPU in combination with the hierarchical charge partitioning (HCP) multiscale approximation. This implementation delivers a combined 1800-fold speedup for a 476,040-atom viral capsid.

    GPU First -- Execution of Legacy CPU Codes on GPUs

    Utilizing GPUs is critical for high performance on heterogeneous systems. However, leveraging the full potential of GPUs for accelerating legacy CPU applications can be a challenging task for developers. The porting process requires identifying code regions amenable to acceleration, managing distinct memories, synchronizing host and device execution, and handling library functions that may not be directly executable on the device. This complexity makes it challenging for non-experts to leverage GPUs effectively, or even to start offloading parts of a large legacy application. In this paper, we propose a novel compilation scheme called "GPU First" that automatically compiles legacy CPU applications directly for GPUs without any modification of the application source. Library calls inside the application are either resolved through our partial libc GPU implementation or via automatically generated remote procedure calls to the host. Our approach simplifies the task of identifying code regions amenable to acceleration and enables rapid testing of code modifications on actual GPU hardware in order to guide porting efforts. Our evaluation on two HPC proxy applications with OpenMP CPU and GPU parallelism, four microbenchmarks with originally GPU-only parallelism, as well as three benchmarks from the SPEC OMP 2012 suite featuring hand-optimized OpenMP CPU parallelism showcases the simplicity of porting host applications to the GPU. For existing parallel loops, we often match the performance of corresponding manually offloaded kernels, with up to 14.36x speedup on the GPU, validating that our GPU First methodology can effectively guide porting efforts of large legacy applications.

    Defer Mechanism for C

    The defer mechanism can restore a previously known property or invariant that is altered during the processing of a code block. The defer mechanism is useful for paired operations, where one operation is performed at the start of a code block and the paired operation is performed before exiting the block. Because blocks can be exited using a variety of mechanisms, operations are frequently paired incorrectly. The defer mechanism in C is intended to help ensure the proper pairing of these operations. This pattern is common in resource management, synchronization, and outputting balanced strings (e.g., parentheses or HTML). A separable feature of the defer mechanism is a panic/recover mechanism that allows error handling at a distance.
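
    The pairing idea can be illustrated with a minimal sketch. The abstract does not show the proposed C syntax, so the example below uses Python's standard `contextlib.ExitStack` as a stand-in for a defer facility; the function `render_list` and its `stop_at` parameter are hypothetical names introduced only for illustration. Each opening tag registers its closing tag as a deferred action, so the output stays balanced even when the block is left early through an exception. The `except` clause is only a loose analogue of recovering from an error away from where it was raised.

```python
from contextlib import ExitStack


def render_list(items, stop_at=None):
    """Build a balanced HTML list, even if rendering stops early."""
    out = []
    try:
        with ExitStack() as outer:
            out.append("<ul>")
            # Deferred action: runs when the outer block exits, by any path.
            outer.callback(out.append, "</ul>")
            for item in items:
                with ExitStack() as inner:
                    out.append("<li>")
                    # Paired close for this item, deferred to inner block exit.
                    inner.callback(out.append, "</li>")
                    if item == stop_at:
                        # Hypothetical failure, used to force an abnormal exit.
                        raise ValueError(f"cannot render {item!r}")
                    out.append(str(item))
    except ValueError:
        # The error is handled here, away from the point where it was raised.
        pass
    return "".join(out)


print(render_list(["a", "b"]))                    # <ul><li>a</li><li>b</li></ul>
print(render_list(["a", "b", "c"], stop_at="b"))  # <ul><li>a</li><li></li></ul>
```

    In the second call the loop is abandoned at item "b", yet both the pending `</li>` and the `</ul>` are still emitted, in reverse order of registration.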
